Mean Estimation from One-Bit Measurements
We consider the problem of estimating the mean of a symmetric log-concave
distribution under the constraint that only a single bit per sample from this
distribution is available to the estimator. We study the mean squared error as
a function of the sample size (and hence the number of bits). We consider three
settings: first, a centralized setting, where an encoder may release $n$ bits given a sample of size $n$, and for which there is no asymptotic penalty for
quantization; second, an adaptive setting in which each bit is a function of
the current observation and previously recorded bits, where we show that the
optimal relative efficiency compared to the sample mean is precisely the
efficiency of the median; lastly, we show that in a distributed setting where
each bit is only a function of a local sample, no estimator can achieve optimal
efficiency uniformly over the parameter space. We additionally complement our
results in the adaptive setting by showing that \emph{one} round of adaptivity
is sufficient to achieve optimal mean-square error.
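The median-efficiency claim in the adaptive setting is easy to illustrate numerically. The following is a minimal sketch (not code from the paper; the Gaussian example, sample size, and trial count are assumptions of the demo) comparing the mean squared error of the sample median to that of the sample mean; for Gaussian data their ratio approaches $\pi/2$.

```python
import numpy as np

# Minimal sketch: relative efficiency of the median vs. the mean for
# Gaussian data. The median needs about pi/2 ~ 1.57 times more samples
# to match the mean's MSE, which is the efficiency figure quoted above.
rng = np.random.default_rng(0)
n, trials, mu = 500, 10_000, 1.0
x = rng.normal(loc=mu, scale=1.0, size=(trials, n))

mse_mean = np.mean((x.mean(axis=1) - mu) ** 2)          # ~ sigma^2 / n
mse_median = np.mean((np.median(x, axis=1) - mu) ** 2)  # ~ (pi/2) sigma^2 / n
print(f"MSE ratio: {mse_median / mse_mean:.3f}  (pi/2 = {np.pi / 2:.3f})")
```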
Mean Estimation from Adaptive One-bit Measurements
We consider the problem of estimating the mean of a normal distribution under
the following constraint: the estimator can access only a single bit from each
sample from this distribution. We study the squared error risk in this
estimation as a function of the number of samples and one-bit measurements $n$. We consider an adaptive estimation setting where the single bit sent at step $n$ is a function of both the new sample and the previously acquired bits. For this setting, we show that no estimator can attain asymptotic mean squared error smaller than $\pi/(2n)$ times the variance. In other words, the one-bit restriction increases the number of samples required for a prescribed accuracy of estimation by a factor of at least $\pi/2$ compared to the unrestricted case. In addition, we provide an explicit estimator that attains this asymptotic error, showing that, rather surprisingly, only $\pi/2$ times more samples are required in order to attain estimation performance equivalent to the unrestricted case.
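One natural scheme attaining this bound is a stochastic-approximation recursion in which each bit is the sign of the new sample relative to the current estimate. The sketch below is an illustration under assumed parameters, not necessarily the paper's exact estimator; in particular it takes the noise scale sigma as known when setting the gain.

```python
import numpy as np

# Hedged sketch of an adaptive one-bit estimator (a Robbins-Monro-type
# sign recursion; the paper's explicit estimator may differ in details).
# Bit at step i: sign(X_i - current estimate). With gain a/i and
# a = sigma*sqrt(pi/2), the recursion attains asymptotic MSE
# (pi/2) * sigma^2 / n for Gaussian samples, matching the lower bound.
rng = np.random.default_rng(1)
mu, sigma, n, trials = 0.3, 1.0, 5_000, 2_000
a = sigma * np.sqrt(np.pi / 2)  # asymptotically optimal gain (sigma assumed known)

theta = np.zeros(trials)        # running estimates, one per independent trial
for i in range(1, n + 1):
    x = rng.normal(mu, sigma, size=trials)
    bit = np.sign(x - theta)    # the single bit sent at step i
    theta += (a / i) * bit      # one-bit update of the estimate

print(f"empirical n*MSE: {n * np.mean((theta - mu) ** 2):.3f}")
print(f"(pi/2)*sigma^2 : {np.pi / 2 * sigma**2:.3f}")
```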
The minimax risk in testing the histogram of discrete distributions for uniformity under missing ball alternatives
We consider the problem of testing the fit of a discrete sample of items from
many categories to the uniform distribution over the categories. As a class of
alternative hypotheses, we consider the removal of an $\ell_p$ ball of radius $\epsilon$ around the uniform rate sequence, for $p \leq 2$. We deliver a sharp characterization of the asymptotic minimax risk when $\epsilon \to 0$ as the
number of samples and number of dimensions go to infinity, for testing based on
the occurrences' histogram (number of absent categories, singletons,
collisions, ...). For example, for $p=1$ and in the limit of a small expected number of samples $n$ compared to the number of categories $N$ (aka the "sub-linear" regime), the minimax risk asymptotes to $2\bar{\Phi}(n\epsilon^2/\sqrt{8N})$, with $\bar{\Phi}$ the
normal survival function. Empirical studies over a range of problem parameters
show that this estimate is accurate in finite samples, and that our test is
significantly better than the chisquared test or a test that only uses
collisions. Our analysis is based on the asymptotic normality of histogram
ordinates, the equivalence between the minimax setting and a Bayesian one, and the reduction of a multi-dimensional optimization problem to a one-dimensional problem.
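For concreteness, the occurrences' histogram (the counts-of-counts statistic on which the tests above are based) can be computed as follows; the Poissonized uniform sampling and the values of $N$ and $n$ are assumptions of this demo, not the paper's code.

```python
import numpy as np

# Sketch: compute the occurrences' histogram (counts of counts) of a
# sample from the uniform distribution over N categories. Poissonized
# sampling and the parameter values are demo assumptions.
rng = np.random.default_rng(2)
N, n = 10_000, 2_000                           # categories; expected sample size
counts = rng.poisson(n / N, size=N)            # occurrences per category (null)

ordinates = np.bincount(counts, minlength=3)   # ordinates[j] = #categories seen j times
print("absent    :", ordinates[0])
print("singletons:", ordinates[1])
print("collisions:", int(ordinates[2:].sum()))
```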
Separating the Human Touch from AI-Generated Text using Higher Criticism: An Information-Theoretic Approach
We propose a method to determine whether a given article was entirely written
by a generative language model versus an alternative situation in which the
article includes some significant edits by a different author, possibly a
human. Our process involves many perplexity tests for the origin of individual
sentences or other text atoms, combining these multiple tests using Higher
Criticism (HC). As a by-product, the method identifies parts suspected to be
edited. The method is motivated by the convergence of the log-perplexity to the
cross-entropy rate and by a statistical model for edited text in which sentences are mostly generated by the language model, except perhaps for a few sentences that might have originated via a different mechanism. We demonstrate
the effectiveness of our method using real data and analyze the factors
affecting its success. This analysis raises several interesting open challenges
whose resolution may improve the method's effectiveness.
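A minimal sketch of the HC combination step follows. The per-sentence p-values would come from perplexity tests of individual text atoms under the candidate language model; here they are simulated, which is an assumption of the demo.

```python
import numpy as np

# Sketch of combining many per-sentence p-values with Higher Criticism.
# Real p-values would come from perplexity tests under the language model;
# the simulated values below are a stand-in for the demo.
def higher_criticism(pvals, gamma=0.5):
    """HC statistic: maximal standardized deviation of sorted p-values
    from their uniform-null expectations, over the smallest gamma-fraction."""
    p = np.sort(np.asarray(pvals))
    n = len(p)
    i = np.arange(1, n + 1)
    hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1 - p) + 1e-12)
    return hc[: max(1, int(gamma * n))].max()

rng = np.random.default_rng(3)
machine_only = rng.uniform(size=200)                    # null: all sentences fit
edited = np.concatenate([rng.uniform(size=190),
                         1e-3 * rng.uniform(size=10)])  # a few edited sentences
print(f"HC(machine-only): {higher_criticism(machine_only):.2f}")
print(f"HC(edited)      : {higher_criticism(edited):.2f}")
```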
Higher Criticism for Discriminating Word-Frequency Tables and Testing Authorship
We adapt the Higher Criticism (HC) goodness-of-fit test to measure closeness
between word-frequency tables. We apply this measure to authorship attribution
challenges, where the goal is to identify the author of a document using other
documents whose authorship is known. The method is simple yet performs well
without handcrafting or tuning, reporting accuracy at the state-of-the-art level in various current challenges. As an inherent side effect, the HC
calculation identifies a subset of discriminating words. In practice, the
identified words have low variance across documents belonging to a corpus of
homogeneous authorship. We conclude that in comparing the similarity of a new
document and a corpus of a single author, HC is mostly affected by words
characteristic of the author and is relatively unaffected by topic structure.
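The word-level construction behind this measure can be sketched as below. The two-proportion z-test per word is a simple stand-in for the exact allocation test one might use, and the toy counts are demo assumptions rather than data from the paper.

```python
import numpy as np
from math import erfc, sqrt

# Sketch of an HC discrepancy between two word-frequency tables. A
# two-proportion z-test per word stands in for an exact binomial test;
# the toy counts below are demo assumptions.
def word_pvalues(c1, c2):
    n1, n2 = c1.sum(), c2.sum()
    pool = (c1 + c2) / (n1 + n2)                           # pooled word rates
    se = np.sqrt(pool * (1 - pool) * (1 / n1 + 1 / n2)) + 1e-12
    z = (c1 / n1 - c2 / n2) / se
    return np.array([erfc(abs(v) / sqrt(2)) for v in z])  # two-sided p-values

def hc_statistic(pvals, gamma=0.5):
    p = np.sort(pvals); n = len(p); i = np.arange(1, n + 1)
    hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1 - p) + 1e-12)
    return hc[: max(1, int(gamma * n))].max()

rng = np.random.default_rng(4)
rates = rng.dirichlet(np.ones(500))                    # shared 500-word vocabulary
doc = rng.multinomial(5_000, rates).astype(float)      # new document
corpus = rng.multinomial(50_000, rates).astype(float)  # single-author corpus
print(f"HC discrepancy: {hc_statistic(word_pvalues(doc, corpus)):.2f}")
```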
Unification of Rare/Weak Detection Models using Moderate Deviations Analysis and Log-Chisquared P-values
Rare/Weak models for multiple hypothesis testing assume that only a small
proportion of the tested hypotheses concern non-null effects and the individual
effects are only moderately large, so that they generally do not stand out
individually, for example in a Bonferroni analysis. Such rare/weak models have
been studied in quite a few settings; for example, some studies focused on an underlying Gaussian means model for the hypotheses being tested, while others considered a Poisson model. It seems not to have been noticed before that such seemingly
different models have asymptotically the following common structure:
Summarizing the evidence each test provides by the negative logarithm of its
P-value, previous rare/weak model settings are asymptotically equivalent to
detection where most negative log P-values have a standard exponential
distribution but a small fraction of the P-values might have an alternative
distribution which is moderately larger; we do not know which individual tests
those might be, or even if there are any such. Moreover, the alternative
distribution is noncentral chisquared on one degree of freedom. We characterize
the asymptotic performance of global tests combining these P-values in terms of
the chisquared mixture parameters: the scaling parameters controlling
heteroscedasticity, the non-centrality parameter describing the effect size
whenever it exists, and the parameter controlling the rarity of the non-null
effects. Specifically, in a phase space involving the last two parameters, we
derive a region where all tests are asymptotically powerless. Outside of this
region, the Berk-Jones and the Higher Criticism tests have maximal power.
Inference techniques based on the minimal P-value, false-discovery rate control, and Fisher's test have sub-optimal asymptotic phase diagrams. We
provide various examples for multiple testing problems of the said common
structure.
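As one concrete instance of this common structure, consider a sparse Gaussian means model: a fraction $n^{-\beta}$ of $n$ two-sided tests carry an effect of size $\sqrt{2r\log n}$, so the null $-\log$ P-values are standard exponential while, by moderate deviations, the non-null $-2\log$ P-values are approximately noncentral chisquared on one degree of freedom. The sketch below (parameter values are demo choices) computes HC and the minimal-P-value (Bonferroni) statistic on such data.

```python
import numpy as np
from math import erfc, log, sqrt

# Sketch of one instance of the rare/weak common structure: n two-sided
# Gaussian tests, a fraction n^(-beta) of which carry an effect of size
# sqrt(2*r*log n). beta and r (the rarity and effect-size parameters of
# the phase space above) are demo choices.
def hc_statistic(pvals, gamma=0.5):
    p = np.sort(pvals); n = len(p); i = np.arange(1, n + 1)
    hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1 - p) + 1e-12)
    return hc[: max(1, int(gamma * n))].max()

rng = np.random.default_rng(5)
n, beta, r = 100_000, 0.7, 0.8
k = int(round(n ** (1 - beta)))   # number of non-null effects (~32 here)
mu = sqrt(2 * r * log(n))         # moderately large effect size

z = rng.normal(size=n)
z_alt = z.copy()
z_alt[:k] += mu                   # plant the rare/weak effects

for name, zz in (("global null", z), ("rare/weak  ", z_alt)):
    p = np.array([erfc(abs(v) / sqrt(2)) for v in zz])  # two-sided p-values
    print(f"{name}: HC = {hc_statistic(p):7.2f}, n*min(p) = {n * p.min():.3f}")
```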